Distributed vector architectures
Abstract
Integrating processors and main memory is a promising approach to increasing system performance. Such integration provides very high memory bandwidth that can be exploited efficiently by vector operations. However, traditional vector applications would easily overflow the limited memory of a single integrated node. To accommodate such workloads, we propose the DIstributed Vector Architecture (DIVA), which uses multiple vector-capable processor/memory nodes in a distributed shared-memory configuration while maintaining the simple vector programming model. The advantages of our approach are twofold: (i) we dynamically parallelize the execution of vector instructions across the nodes, and (ii) we reduce external traffic by mapping vector computation, rather than data, across the nodes. We propose run-time mechanisms that assign elements of the architectural vector registers to nodes, using the data layout across the nodes as a blueprint. We describe DIVA implementations with a traditional request-response memory model and with a data-push model. Using traces of vector supercomputer programs, we demonstrate that DIVA generates considerably less external traffic than single-node or multiple-node alternatives that are based solely on caching or paging. With timing simulations, we show that a DIVA system with 2 to 8 nodes is up to three times faster than a single node using its local memory as a large cache, and can even outperform a hypothetical system in which the application fits in local memory.
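The idea of using the data layout as a blueprint for placing vector register elements can be illustrated with a small sketch. It is not the mechanism from the paper: the page size, the element size, the round-robin page interleaving, and the names `home_node`, `map_vector_load`, and `external_transfers` are all assumptions made for illustration. Each element of a strided vector load is assigned to the node that holds its data, and a later operation counts how many operand elements would have to cross nodes.

```python
from collections import Counter

PAGE_BYTES = 4096  # assumed page size (not stated in the abstract)
ELEM_BYTES = 8     # assumed 64-bit vector elements


def home_node(addr, num_nodes):
    """Node owning a byte address under an assumed round-robin page layout."""
    return (addr // PAGE_BYTES) % num_nodes


def map_vector_load(base, stride_elems, length, num_nodes):
    """Assign each element of a strided vector load to the node holding its data."""
    return [home_node(base + i * stride_elems * ELEM_BYTES, num_nodes)
            for i in range(length)]


def external_transfers(dest_map, src_map):
    """Count operand elements that live on a different node than the
    destination register element they are combined with."""
    return sum(1 for d, s in zip(dest_map, src_map) if d != s)


NODES = 4
LENGTH = 2048  # 2048 elements * 8 bytes = 4 pages

# Hypothetical vectors: a and b are aligned on the same nodes, c is shifted by one page.
a_map = map_vector_load(base=0, stride_elems=1, length=LENGTH, num_nodes=NODES)
b_map = map_vector_load(base=4 * PAGE_BYTES, stride_elems=1, length=LENGTH, num_nodes=NODES)
c_map = map_vector_load(base=1 * PAGE_BYTES, stride_elems=1, length=LENGTH, num_nodes=NODES)

print(Counter(a_map))                     # elements of a per node
print(external_transfers(a_map, b_map))   # a + b: operands co-located
print(external_transfers(a_map, c_map))   # a + c: operands misaligned
```

In this toy layout, an operation on two vectors whose pages fall on the same nodes needs no cross-node operand traffic, while a one-page misalignment forces every element to move. That is the kind of external traffic the abstract says the mapping aims to reduce.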
Related resources
A Discussion on Parallelization Schemes for Stochastic Vector Quantization Algorithms
This paper studies parallelization schemes for stochastic Vector Quantization algorithms in order to obtain speed-ups using distributed resources. We show that the most intuitive parallelization scheme does not lead to better performance than the sequential algorithm. Another distributed scheme is therefore introduced which obtains the expected speed-ups. Then, it is improved to fit implem...
Decision Fusion in Distributed Detection and Bioinformatics
Yingqin Yuan; Moshe Kam, Ph.D. This thesis describes decision fusion architectures and demonstrates decision fusion applications in bioinformatics. In the first part of the thesis, we investigate a new architecture for distributed binary hypothesis detection where all local detectors share a common channel to communicate with the decisi...
Comparative performance analysis of uniformly distributed applications
A simple programming model of distributed-memory message-passing computer systems is first applied to describe the couple architecture/application by two sets of parameters. The node timing formula is then derived on the basis of scalar, vector and communication components. A set of suitability functions, extracted from the performance formulae, are defined. These functions are shown to be particu...
Generative Paragraph Vector
The recently introduced Paragraph Vector is an efficient method for learning high-quality distributed representations for pieces of text. However, an inherent limitation of Paragraph Vector is its inability to infer distributed representations for texts outside of the training set. To tackle this problem, we introduce a Generative Paragraph Vector, which can be viewed as a probabilistic exten...
Advanced Numerical Methods for Numerical Weather Prediction
The long-term goal of this research is to explore new numerical methods for the next generation global atmospheric model. New methods are being considered because of the paradigm shift in high performance computing from vector computers to distributed memory machines. To take full advantage of the new architectures the global domain must now be partitioned into subdomains/elements that ca...
Covariance Analysis of a vector tracking GPS receiver based on MMSE multiuser Detection
In high dynamic conditions, using vector tracking loops instead of scalar tracking loops in GPS receivers has proved to be an efficient way to compensate for degraded performance. The Minimum Mean Squared Error detector, a multiuser detector, is applied in the vector tracking loop for greater reliability and efficiency. The Kalman filter does the two tasks of tracking and extracting the navigation data aft...
Journal title:
- Journal of Systems Architecture
Volume 46
Publication date: 2000